On speaker adaptive training of artificial neural networks
نویسندگان
چکیده
In the paper we present two techniques improving the recognition accuracy of multilayer perceptron neural networks (MLP ANN) by means of adopting Speaker Adaptive Training. The use of the MLP ANN, usually in combination with the TRAPS parametrization, includes applications in speech recognition tasks, discriminative features production for GMM-HMM and other. In the first SAT experiments, we used the VTLN as a speaker normalization technique. Moreover, we developed a novel speaker normalization technique called Minimum Error Linear Transform (MELT) that resembles the cMLLR/fMLLR method [1] with respect to the possible application either on the model or features. We tested these two methods extensively on telephone speech corpus SpeechDat-East. The results obtained in these experiments suggest that incorporation of SAT into MLP ANN training process is beneficial and depending on the setup leads to significant decrease of phoneme error rate (3 % – 8 % absolute, 12 % – 25 % relative).
منابع مشابه
Forecasting Industrial Production in Iran: A Comparative Study of Artificial Neural Networks and Adaptive Nero-Fuzzy Inference System
Forecasting industrial production is essential for efficient planning by managers. Although there are many statistical and mathematical methods for prediction, the use of intelligent algorithms with desirable features has made significant progress in recent years. The current study compared the accuracy of the Artificial Neural Networks (ANN) and Adaptive Nero-Fuzzy Inference System (ANFIS) app...
متن کاملHourly Wind Speed Prediction using ARMA Model and Artificial Neural Networks
In this paper, a comparison study is presented on artificial intelligence and time series models in 1-hour-ahead wind speed forecasting. Three types of typical neural networks, namely adaptive linear element, multilayer perceptrons, and radial basis function, and ARMA time series model are investigated. The wind speed data used are the hourly mean wind speed data collected at Binalood site in I...
متن کاملEmbedding-Based Speaker Adaptive Training of Deep Neural Networks
An embedding-based speaker adaptive training (SAT) approach is proposed and investigated in this paper for deep neural network acoustic modeling. In this approach, speaker embedding vectors, which are a constant given a particular speaker, are mapped through a control network to layer-dependent elementwise affine transformations to canonicalize the internal feature representations at the output...
متن کاملOn the use of back propagation and radial basis function neural networks in surface roughness prediction
Various artificial neural networks types are examined and compared for the prediction of surface roughness in manufacturing technology. The aim of the study is to evaluate different kinds of neural networks and observe their performance and applicability on the same problem. More specifically, feed-forward artificial neural networks are trained with three different back propagation algorithms, ...
متن کاملReinforcement Learning in Neural Networks: A Survey
In recent years, researches on reinforcement learning (RL) have focused on bridging the gap between adaptive optimal control and bio-inspired learning techniques. Neural network reinforcement learning (NNRL) is among the most popular algorithms in the RL framework. The advantage of using neural networks enables the RL to search for optimal policies more efficiently in several real-life applicat...
متن کاملشبکه عصبی پیچشی با پنجرههای قابل تطبیق برای بازشناسی گفتار
Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
متن کامل